Skip to content

fix(openai-bridge): address PR #1429 review follow-ups (alias miss, error state, wire shape)#1442

Merged
slin1237 merged 15 commits into
mainfrom
fix/openai-bridge-review-followups
May 5, 2026
Merged

fix(openai-bridge): address PR #1429 review follow-ups (alias miss, error state, wire shape)#1442
slin1237 merged 15 commits into
mainfrom
fix/openai-bridge-review-followups

Conversation

@slin1237
Copy link
Copy Markdown
Collaborator

@slin1237 slin1237 commented May 4, 2026

Summary

PR #1429 left a number of CodeRabbit / Codex / Claude review comments unaddressed (or only partially addressed). This PR closes them, grouped by the bug each fix touches. Every fix has a regression test; the existing wire shape stays unchanged for the non-MCP and explicit-Passthrough cases.

# Bug Severity Comments addressed
1 Aliased builtin lookup miss + Passthrough downgrade leaves stale entry 🔴 + 🟠 claude 3174955420, claude 3174955773, codex 3174963504, coderabbit 3174799419, coderabbit 3174799424, coderabbit 3174987646, claude 3174778983, codex 3176332060
2 transform_tool_output discards is_error/error_message 🟠 coderabbit 3175392558
3 mcp_list_tools.tools[].annotations widened to full struct 🟠 coderabbit 3174799428, coderabbit 3174799434
4 Replayed structural input items lose content 🟠 coderabbit 3175392562
5 OpenAI streaming/non-streaming format-registry eager 500 🟠 coderabbit 3175392563
6 gRPC regular streaming output_item.done ad-hoc envelope 🟠 coderabbit 3174987653
7 mcp_test bypasses bridge helper 🟡 coderabbit 3175392565

False positive (no change): gemini 3174766561server() helper already lives inside #[cfg(test)] mod tests {}.
Skipped (perf micro-nit, not a hot path): gemini 3174766567.

Why these matter

  • (1) is the worst regression. McpToolSession::collect_visible_mcp_tools (crates/mcp/src/core/session.rs:565-571) replaces a direct entry with its alias entries when an alias is registered, so production session lookup of an aliased builtin queries ("alias", alias_name). The previous code only landed the format on (server, tool), so any aliased builtin silently degraded to mcp_call instead of the hosted shape (e.g. web_search_call / image_generation_call). The fix mirrors non-Passthrough formats on both keys and adds explicit downgrade-clears so a config that demotes a tool to Passthrough no longer leaves the stale hosted entry behind.
  • (2) is a correctness bug in failure replay: a failed hosted-tool execution was persisted and re-shipped to the model as status = "completed". The fix bypasses the success-path builders when is_error is set and emits a typed failure item. mcp_call carries error; the four hosted-builtin families convey failure via status: "failed" only (matches the OpenAI Responses spec).
  • (3) restores the wire-spec narrowing introduced by 1c82bca1 and reverted by the bridge move.
  • (4) the previous code only treated function_call/function_call_output as whole-item on the input side, so any other replayed structural item (image_generation_call, web_search_call, mcp_call, mcp_list_tools) collapsed to content: [] on persist, dropping the item from the next history load.
  • (5) the previous fail-fast check ran before the router knew whether the request would even use MCP routing, so plain non-MCP requests 500'd in deployments without MCP wiring. The check is now scoped to MCP-laden requests via a new request_uses_mcp_routing(tools) predicate; the safety contract is preserved exactly when it matters.
  • (6) the gRPC streaming output_item.done payload was hand-rolled with {name, status, arguments, output|error} — wrong shape for hosted-builtin variants (no name/arguments/output fields) and leaked an error field on hosted failures (only mcp_call carries error). Now serializes the typed item from transform_tool_output and reuses it for both the wire payload and state.record_call.

Test plan

  • cargo build -p smg clean
  • cargo clippy --all-targets --all-features -- -D warnings clean
  • cargo fmt --all -- --check clean
  • cargo test -p smg --lib openai_bridge:: — 50 pass (5 new regression tests added)
  • cargo test -p smg --test mcp_test — 23 pass (2 tests rerouted through transform_tool_output)
  • cargo test -p smg-mcp — 158 pass
  • cargo test -p smg --lib — 731 pass
  • CI green

Summary by CodeRabbit

  • Bug Fixes

    • More consistent failure reporting for tool executions (failed status, clearer error message, empty output where appropriate).
    • Streaming now emits the correct completion/failure events and persists a single, consistent output item.
  • Improvements

    • Response-format rules now correctly mirror/clear hosted formats and respect aliases.
    • Better detection/routing for MCP/built-in tools and simpler tool metadata (read-only) serialization.
    • Non-message items are now stored with full top-level fields when replayed.
  • Tests

    • Updated/added tests covering transformations, formats, annotations, and persistence.

slin1237 added 7 commits May 4, 2026 10:21
…on Passthrough downgrade

`McpToolSession::collect_visible_mcp_tools` (crates/mcp/src/core/session.rs:565-571)
*replaces* a direct tool entry with its alias entries when an alias is
registered. The session-side lookup path
(`qualified_name_for_exposed("web_search")` →
`("alias", "web_search")` → `registry.lookup`) therefore queries the
alias key — but `populate_from_server_config` was attaching the format
*only* to one of the two keys depending on whether the user supplied an
explicit format. Two consequences:

- An alias-only stanza on a builtin tool (e.g. `tools.do_search.alias =
  "web_search"`, no `response_format`) wrote the builtin default at the
  direct key only. The session resolved through the alias key and
  silently degraded to `mcp_call` instead of the hosted shape.
- A stanza with both `alias` and a concrete `response_format` wrote only
  the alias key. Direct dispatch via `(server, tool)` then degraded to
  Passthrough.

Mirror non-Passthrough formats on both keys whenever an alias exists —
both for explicit per-tool configs and for the builtin-default fallback.
And while we're here, make the documented "safe to call repeatedly"
contract real: `Some(Passthrough)` now explicitly clears the previous
hosted entry on every key (a config that flips a tool back from a
hosted format to passthrough no longer leaves the stale mapping behind).

Updates the existing alias-format test to assert both keys carry the
hosted shape, replaces the alias-only-builtin test with one that exercises
the alias-key lookup that production actually hits, and adds a new
regression test for the downgrade path.

Comments addressed (PR #1429): coderabbitai 3174799419 / 3174799424 /
3174987646, claude 3174955420 / 3174955773 / 3174778983,
chatgpt-codex-connector 3174963504 / 3176332060.

Signed-off-by: Simo Lin <[email protected]>
…m_tool_output

Previously `transform_tool_output` always delegated to the success-path
builders (`to_mcp_call`, `to_web_search_call`, etc.), which stamp
`status = "completed"` and `error = None`. A failed tool execution
therefore got persisted as a success and replayed as such on the next
turn — and on the streaming path emitted on the wire as a successful
item too.

Bypass the success builders when `output.is_error` is set and emit a
typed failure item instead:

- `mcp_call`: `status = "failed"`, `error = Some(error_message)`,
  preserving the call's tool name / arguments / server label so the
  client can correlate.
- Hosted-builtin variants (`web_search_call` / `code_interpreter_call`
  / `file_search_call` / `image_generation_call`): `status = Failed`
  only — those four families have no `error` field on the wire per the
  Responses API spec.

`smg_mcp::ToolExecutionOutput` is `#[non_exhaustive]`, which blocks
fixture construction in the router test crate. Add a
`pub fn new_for_test(...)` constructor on the type so external test
crates can fabricate one without spinning up a real MCP server. The
helper is annotated with `#[expect(clippy::too_many_arguments)]` and
documented as test-only — adding a field to the struct doesn't break
it, only adding a *required* field would, which is a deliberate signal.

Adds two regression tests: one for the `mcp_call` failure shape and
one parameterized over all four hosted-builtin variants asserting
`status = "failed"` with no `error` field on the wire.

Comment addressed (PR #1429): coderabbitai 3175392558.

Signed-off-by: Simo Lin <[email protected]>
…nly}

PR #1429 inadvertently reverted the wire-spec narrowing introduced by
1c82bca (\"refactor(grpc): unify Responses surfaces\"). That earlier
commit explicitly cut down `mcp_list_tools[].tools[].annotations` to
`{\"read_only\": …}` to match OpenAI's Responses API surface, after
which the bridge move re-serialized the full SMG `ToolAnnotations`
struct (which carries `destructive`/`idempotent`/`open_world` for
internal approval-pipeline use).

Restore the narrowed shape and add a regression test that asserts the
serialized field is exactly `{"read_only": <bool>}`. The richer
annotations stay readable internally via `ToolEntry::annotations`; only
the wire form changes.

Comments addressed (PR #1429): coderabbitai 3174799428 / 3174799434.

Signed-off-by: Simo Lin <[email protected]>
`item_to_new_conversation_item` only treated `function_call` /
`function_call_output` as whole-item on the input side. Other
structural input items (replayed `image_generation_call`,
`web_search_call`, `mcp_call`, `mcp_list_tools`, etc. on a multi-turn
turn carrying prior output back in as input) fell through to
`item_value.get("content").cloned().unwrap_or(json!([]))`. Those types
have no `content` field — the wrapper became `[]`, and the next
history load lost the original item.

Unify the rule to `item_type != "message"` for both sides. Non-message
items have no `content` field on the wire either way, so the input
side now matches the output side's existing whole-item store. Adds two
regression tests: one for the structural input item case and one
asserting message items still extract their `content` array
(message-side behavior is unchanged).

Comment addressed (PR #1429): coderabbitai 3175392562.

Signed-off-by: Simo Lin <[email protected]>
…ests

The previous fail-fast on `mcp_format_registry()` ran before the router
knew whether the request would even enter the MCP interception path.
Plain streaming requests that should pass through
`handle_simple_streaming_passthrough()` (and the non-streaming
equivalent) now returned 500 in deployments where the gateway runs
without MCP wiring.

Hoist the registry resolution into the same arm that decides whether
the request needs MCP routing. The arm:

- Detects MCP-laden requests via a new
  `request_uses_mcp_routing(tools)` predicate (any of `Mcp`,
  `WebSearchPreview`, `CodeInterpreter`, `ImageGeneration`).
- Hard-fails with 500 when an MCP-laden request hits a router with no
  registry component — the original safety contract is preserved
  exactly when it matters.
- Returns `(mcp_servers, registry)` as a tuple so the cloned registry
  follows `mcp_servers` into the interception path without a second
  Option lookup downstream.

Plain (non-MCP) requests no longer touch the registry at all and
succeed regardless of whether it's wired up.

Comment addressed (PR #1429): coderabbitai 3175392563.

Signed-off-by: Simo Lin <[email protected]>
The gRPC regular streaming path hand-rolled an ad-hoc envelope
(`{id, type, name, status, arguments, output|error}`) for the
`output_item.done` SSE payload, then SEPARATELY built the typed item
via `transform_tool_output(...)` only for `state.record_call`. For
hosted-builtin formats this diverged from the persisted item:

- Hosted variants (`web_search_call`/`code_interpreter_call`/
  `file_search_call`/`image_generation_call`) have no `name` /
  `arguments` / `output` fields — the ad-hoc envelope shipped fields
  the wire shape doesn't recognize.
- Failures leaked an `error` field even though only `mcp_call` carries
  one in the Responses API spec; the four hosted families convey
  failure via `status: "failed"` alone.
- `mcp_call.failed` was emitted for every format, including the
  hosted-builtin families that don't have a `*.failed` event in the
  event taxonomy (only `response.mcp_call.failed` exists).

Build the typed item once via `transform_tool_output` (which now
preserves failure state — see preceding commit), serialize it, override
the `id` so `output_item.done` matches the streaming-allocated id used
by the earlier `output_item.added`, and reuse the same item for both
the SSE payload and `state.record_call`. Restrict
`emit_mcp_call_failed` to `Passthrough`; hosted-builtin failure now
emits the format-specific `*.completed` event followed by
`output_item.done` carrying the failed typed item.

Comment addressed (PR #1429): coderabbitai 3174987653.

Signed-off-by: Simo Lin <[email protected]>
…ool_output

The two `test_web_search_transform_*_with_mock` integration tests
hand-fed `ResponseTransformer::transform` with hardcoded
call_id/server_label/tool_name/arguments strings. A regression in
session-side rewriting (which lives on `ToolExecutionOutput.{call_id,
server_label, tool_name, arguments_str}`) would still leave these
assertions green.

Route the assertions through `openai_bridge::transform_tool_output`
instead, which consumes the actual `ToolExecutionOutput` returned by
`session.execute_tool(...)`. Same coverage of the WebSearchCall shape,
plus catches future drift in the session→bridge field plumbing.

Comment addressed (PR #1429): coderabbitai 3175392565.

Signed-off-by: Simo Lin <[email protected]>
@github-actions github-actions Bot added grpc gRPC client and router changes mcp MCP related changes tests Test changes model-gateway Model gateway crate changes openai OpenAI router changes labels May 4, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f28fbc92-8616-4f75-88ee-c6566c567c14

📥 Commits

Reviewing files that changed from the base of the PR and between e9cb555 and 4ef4bcd.

📒 Files selected for processing (2)
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs

📝 Walkthrough

Walkthrough

Added a test-only ToolExecutionOutput::new_for_test, introduced MCP routing detection helper, changed format registry alias/mirroring and builtin handling, simplified MCP annotation serialization, added passthrough-failure shaping, changed whole-item persistence behavior, and made MCP component wiring lazy in streaming and non-streaming paths.

Changes

MCP routing, formats, tool output, persistence, and streaming

Layer / File(s) Summary
Test Foundation
crates/mcp/src/core/orchestrator.rs
Adds pub fn ToolExecutionOutput::new_for_test(...) to construct a fully populated ToolExecutionOutput from explicit inputs (string-like args converted via Into<String>).
Routing Detection
model_gateway/src/routers/common/mcp_utils.rs
Adds pub fn request_uses_mcp_routing(tools: &[ResponseTool]) -> bool that detects MCP entries or hosted builtin tool families.
Format Registry: data & mutation
model_gateway/src/routers/common/openai_bridge/format_registry.rs
Treats per-tool response_format: Some(Passthrough) as an explicit removal of hosted mappings (direct and alias keys); mirrors non-Passthrough hosted formats onto both direct and alias keys; builtin defaults are applied only when per-tool response_format is absent and are mirrored for aliases.
Tool descriptors / annotations
model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
build_mcp_tool_infos now emits annotations: Some({ "read_only": <bool> }) derived from entry.tool.annotations.read_only_hint (default false) instead of serializing the full ToolAnnotations.
Tool output transformation (error passthrough)
model_gateway/src/routers/common/openai_bridge/transformer.rs
When output.is_error and response_format == Passthrough, return a failure-shaped ResponseOutputItem::McpCall (status:"failed", error: Some(msg), output:""). Hosted-builtin formats no longer special-case is_error. Adds helper and unit tests.
Persistence shape & control flow
model_gateway/src/routers/common/persistence_utils.rs
item_to_json supports whole-item store by hoisting top-level fields from object content; item_to_new_conversation_item now sets store_whole_item = item_type != "message", causing non-message items to persist the full object for both inputs/outputs. Tests added for round-trip behaviors.
Streaming emission & recording
model_gateway/src/routers/grpc/regular/responses/streaming.rs
Compute output_item once, serialize into item_done (forced to streaming-allocated item_id) and attach server label; emit format-aware events (e.g., mcp_call.failed for passthrough failures); remove redundant re-transformation and ensure state.record_call(...) records the streamed output_item.
Lazy MCP wiring in handlers
model_gateway/src/routers/openai/responses/{non_streaming.rs,streaming.rs}
MCP client/registry/orchestrator resolved only when original_body.tools exists and request_uses_mcp_routing(tools) is true; otherwise mcp_routing is None and non-MCP forwarding runs without MCP components.
Tests / Integration
model_gateway/tests/mcp_test.rs
Web-search transformation tests updated to call transform_tool_output(&output, ResponseFormat::WebSearchCall) using actual tool execution outputs instead of manually constructed transformer inputs.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ModelGateway
    participant FormatRegistry
    participant MCP_Orchestrator
    participant Persistence
    Client->>ModelGateway: Request with tools
    ModelGateway->>ModelGateway: request_uses_mcp_routing(tools)
    alt MCP routing required
        ModelGateway->>FormatRegistry: ensure/lookup formats
        ModelGateway->>MCP_Orchestrator: ensure_request_mcp_client
        MCP_Orchestrator->>ModelGateway: tool execution result
        ModelGateway->>ModelGateway: transform_tool_output(output, response_format)
        alt Passthrough && output.is_error
            ModelGateway->>Client: emit mcp_call.failed (status: failed)
        else
            ModelGateway->>Persistence: serialize output_item -> item_done (force id)
            ModelGateway->>Client: emit tool_call.completed + output_item.done
        end
    else No MCP routing
        ModelGateway->>Client: forward without MCP wiring
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • CatherineSue
  • key4ng
  • zhaowenzi
  • zhoug9127

Poem

A rabbit crafts a test constructor bright,
Mirrors formats, sets errors right,
Hoists whole items so fields don't hide,
Wires MCP only when tools decide,
Hopping through code with joyous bite 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the pull request as addressing multiple review follow-ups from PR #1429, covering alias resolution, error state handling, and wire shape fixes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/openai-bridge-review-followups

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all 10 changed files (575+/146-). The six fixes are correct and well-tested:

  1. Alias key mirroring — format registry now writes to both direct and alias keys, with proper remove() on Passthrough downgrade. Tests cover round-trip.
  2. Error state handlingfailed_output_item correctly separates mcp_call (carries error field) from hosted-builtins (status-only), matching the Responses spec.
  3. Annotation narrowingbuild_mcp_tool_infos now emits only {"read_only": …} matching the wire spec.
  4. Persistence fix — unifies input/output whole-item storage for non-message types, fixing round-trip loss of structural items.
  5. Format-registry fail-fast scoperequest_uses_mcp_routing correctly mirrors extract_builtin_types + Mcp; plain requests no longer 500.
  6. gRPC streaming — reuses transform_tool_output for both wire and state, eliminating shape divergence; event taxonomy (*.completed + output_item.done with status: failed) is correct for hosted-builtins.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6bf495ac3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +103 to +105
annotations: Some(json!({
"read_only": entry.annotations.read_only,
})),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve wire read_only from MCP tool annotations

build_mcp_tool_infos now emits annotations.read_only from entry.annotations, but ToolEntry::from_server_tool initializes that field with defaults and the server-loading path never copies tool.annotations into it. This means mcp_list_tools can report read_only: false even when the MCP server advertised a read-only tool, regressing client-visible metadata accuracy for every tool loaded via load_server_inventory.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances MCP tool integration by ensuring consistent response formatting and error handling across streaming and non-streaming interfaces. Key improvements include fixing a persistence bug for structural input items, mirroring format configurations for aliased tools, and making the format registry requirement conditional. Feedback focuses on preventing potential deadlocks in the format registry and using constants for default error strings.

Comment thread model_gateway/src/routers/common/openai_bridge/format_registry.rs
Comment thread model_gateway/src/routers/common/openai_bridge/transformer.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/regular/responses/streaming.rs (1)

752-837: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Patch the typed item before recording it.

transform_tool_output(...) builds ids from tool_output.call_id (see model_gateway/src/routers/common/openai_bridge/transformer.rs:29-105), while item_id comes from allocate_output_index(...). Here you only overwrite the serialized item_done["id"], so state.record_call(..., output_item, ...) still stores a different id than the client saw on output_item.added/output_item.done.

That breaks the new “single source of truth” invariant and can leak mismatched item ids into persisted or replayed state.

Suggested fix
-                let output_item =
+                let mut output_item =
                     openai_bridge::transform_tool_output(&tool_output, response_format);
-                let mut item_done = serde_json::to_value(&output_item).unwrap_or_else(|e| {
+                overwrite_response_output_item_id(&mut output_item, &item_id);
+                let mut item_done = serde_json::to_value(&output_item).unwrap_or_else(|e| {
                     warn!(
                         tool = %tool_output.tool_name,
                         error = %e,
                         "Failed to serialize transformed output item; falling back to a minimal stub",
                     );
                     json!({
                         "id": item_id,
                         "type": item_type,
                         "status": if success { "completed" } else { "failed" },
                     })
                 });
-                // Preserve the streaming-allocated id so `output_item.done`
-                // matches the earlier `output_item.added`.
-                if let Some(obj) = item_done.as_object_mut() {
-                    obj.insert("id".to_string(), json!(&item_id));
-                }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs` around lines
752 - 837, The transformed typed output (output_item from transform_tool_output)
still contains the original id derived from tool_output.call_id while only the
serialized item_done had its "id" overwritten with allocate_output_index's
item_id; before calling state.record_call(...) update/patch output_item so its
id matches item_id (the same value inserted into item_done) so stored state uses
the same id the client saw; locate transform_tool_output, the local variables
output_item, item_id, item_done, and the state.record_call(...) invocation and
mutate or rebuild output_item's id field to equal item_id just prior to the
state.record_call call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/common/mcp_utils.rs`:
- Around line 250-264: The function request_uses_mcp_routing currently infers
MCP usage from tool variants; change it to consult each tool's
resolved/configured routing instead of a hard-coded enum match: in
request_uses_mcp_routing(iterating ResponseTool items) call the tool's
routing/resolution accessor (or resolve its Route/Endpoint) and return true only
if that resolved route is an MCP-route (including hosted builtin servers that
are actually routed via MCP) — also ensure the file_search family is considered
by its resolved routing; update logic to use the tool's routing field/method
(e.g., the tool.resolve_routing() or ResponseTool::routing / Route enum) rather
than matching on
ResponseTool::WebSearchPreview/CodeInterpreter/ImageGeneration/etc.

In `@model_gateway/src/routers/common/openai_bridge/transformer.rs`:
- Around line 46-66: failed_output_item currently substitutes a generic "Tool
execution failed" and clears the tool output when output.error_message is None,
discarding real MCP-declared error text present in output.output; update
failed_output_item (the ResponseOutputItem::McpCall branch) so that err_msg is
set to output.error_message.clone().unwrap_or_else(||
output.output.clone().unwrap_or_else(|| "Tool execution failed".to_string())),
and preserve the original tool output by setting the McpCall.output field to
output.output.clone() (instead of String::new()); this ensures real error text
from smg_mcp::ToolExecutionOutput.output is used when error_message is absent.

In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 330-339: The write-side change sets store_whole_item = item_type
!= "message" but item_to_json() still sends unknown non-message types through
the default "content" branch, so restore loses top-level fields; update
item_to_json (and any helper used to format items) to check the same condition
(e.g. item_type != "message" or a passed-in store_whole_item flag) and, for
non-message types like image_generation_call/web_search_call/mcp_call, serialize
and return the full item JSON (preserving top-level fields such as
revised_prompt/status) instead of wrapping it under a "content" object; reuse
the existing whole-item serialization logic used on write or pass the original
item map through so round-trips preserve the original shape.

In `@model_gateway/src/routers/openai/responses/non_streaming.rs`:
- Around line 77-90: The code eagerly calls ctx.components.mcp_orchestrator()
for every request; move that orchestrator lookup into the guarded branch so it
is only retrieved when original_body.tools.as_deref() is Some and
request_uses_mcp_routing(tools) is true. Concretely, remove any prior call to
ctx.components.mcp_orchestrator() outside this match and call
ctx.components.mcp_orchestrator() inside the Some(tools) => { ... } arm right
before ensure_request_mcp_client(...) so the orchestrator is only required for
MCP-routed requests while plain non-tool requests can fall through to the direct
upstream path. Ensure you still call ctx.components.mcp_format_registry() as
currently done and pass the retrieved orchestrator into
ensure_request_mcp_client(...) as before.

In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 1089-1110: The code currently calls
ctx.components.mcp_orchestrator() unconditionally earlier, which can cause 500s
for non-MCP requests; change the logic so the orchestrator is retrieved lazily
inside the branch that handles MCP routing: move the
ctx.components.mcp_orchestrator() lookup into the Some(tools) if
request_uses_mcp_routing(tools) branch, verify it exists (returning the same
internal_error if missing), then call
ensure_request_mcp_client(&mcp_orchestrator, &registry, tools).await and map to
(servers, registry) as before so non-MCP paths never require the orchestrator;
update any upstream bindings that referenced mcp_orchestrator to expect it only
when mcp_routing is Some.

---

Outside diff comments:
In `@model_gateway/src/routers/grpc/regular/responses/streaming.rs`:
- Around line 752-837: The transformed typed output (output_item from
transform_tool_output) still contains the original id derived from
tool_output.call_id while only the serialized item_done had its "id" overwritten
with allocate_output_index's item_id; before calling state.record_call(...)
update/patch output_item so its id matches item_id (the same value inserted into
item_done) so stored state uses the same id the client saw; locate
transform_tool_output, the local variables output_item, item_id, item_done, and
the state.record_call(...) invocation and mutate or rebuild output_item's id
field to equal item_id just prior to the state.record_call call.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cd3fa90d-5e4e-4b74-8ed3-cae71465412c

📥 Commits

Reviewing files that changed from the base of the PR and between fe4faa9 and b6bf495.

📒 Files selected for processing (10)
  • crates/mcp/src/core/orchestrator.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/tests/mcp_test.rs

Comment thread model_gateway/src/routers/common/mcp_utils.rs
Comment thread model_gateway/src/routers/common/openai_bridge/transformer.rs Outdated
Comment thread model_gateway/src/routers/common/persistence_utils.rs Outdated
Comment thread model_gateway/src/routers/openai/responses/non_streaming.rs
Comment thread model_gateway/src/routers/openai/responses/streaming.rs Outdated
Drop narration that explained what the code does, plus PR/commit
references that will rot. Keep the cross-file invariants and wire-spec
rules that aren't expressed elsewhere — alias-key vs direct-key
mirroring, hosted-builtin failure shape, mcp_call.failed event scope,
non-message items having no `content` field on the wire, and
output_item.done id-override rationale.

Signed-off-by: Simo Lin <[email protected]>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9bbdc28104

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +45 to +48
let err_msg = output
.error_message
.clone()
.unwrap_or_else(|| "Tool execution failed".to_string());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve MCP error payload when error_message is absent

When a tool reports is_error = true without an explicit error_message, this path now replaces the error with the generic string "Tool execution failed", which drops actionable failure details from the tool output. smg_mcp::McpOrchestrator::execute_tool_entry_result can produce exactly that state (is_error from raw_result, error_message: None), so failed MCP calls that only return structured error text in output lose their real reason in transformed mcp_call.error, regressing client-visible diagnostics compared to the prior output_str fallback in the streaming path.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (4)
model_gateway/src/routers/openai/responses/streaming.rs (1)

1089-1105: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The orchestrator lookup is still unconditional.

ctx.components.mcp_orchestrator() is resolved before this request_uses_mcp_routing(tools) guard, so a plain non-MCP streaming request still returns internal_error in MCP-disabled deployments. The lookup needs to move into the guarded branch with the format-registry check.

Suggested fix
-    let mcp_orchestrator = match ctx.components.mcp_orchestrator() {
-        Some(m) => m.clone(),
-        None => {
-            return error::internal_error("internal_error", "MCP orchestrator required");
-        }
-    };
     // Only MCP-laden requests need the format registry; plain streaming
     // requests must still pass through deployments without MCP wiring.
     let mcp_routing = match original_body.tools.as_deref() {
         Some(tools) if request_uses_mcp_routing(tools) => {
+            let Some(mcp_orchestrator) = ctx.components.mcp_orchestrator().cloned() else {
+                return error::internal_error("internal_error", "MCP orchestrator required");
+            };
             let Some(registry) = ctx.components.mcp_format_registry() else {
                 return error::internal_error(
                     "internal_error",
                     "MCP format registry required for requests carrying MCP/builtin tools",
                 );
             };
             let registry = registry.clone();
             ensure_request_mcp_client(&mcp_orchestrator, &registry, tools)
                 .await
-                .map(|servers| (servers, registry))
+                .map(|servers| (servers, registry, mcp_orchestrator))
         }
         _ => None,
     };
 
-    let Some((mcp_servers, mcp_format_registry)) = mcp_routing else {
+    let Some((mcp_servers, mcp_format_registry, mcp_orchestrator)) = mcp_routing else {
         return handle_simple_streaming_passthrough(&client, &worker, headers.as_ref(), req).await;
     };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@model_gateway/src/routers/openai/responses/streaming.rs` around lines 1089 -
1105, Currently ctx.components.mcp_orchestrator() is resolved unconditionally
causing internal errors for non-MCP streaming requests; move the orchestrator
lookup into the branch guarded by request_uses_mcp_routing(tools) and after
confirming ctx.components.mcp_format_registry() exists, so only requests with
MCP/builtin tools call ctx.components.mcp_orchestrator(), then pass that
orchestrator into ensure_request_mcp_client along with the cloned registry and
tools (references: original_body.tools, request_uses_mcp_routing,
mcp_format_registry, mcp_orchestrator, ensure_request_mcp_client).
model_gateway/src/routers/openai/responses/non_streaming.rs (1)

66-80: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The MCP component lookup is still too eager here.

ctx.components.mcp_orchestrator() is still required before you know the request actually needs MCP routing, so plain non-tool requests can still fail with internal_error instead of taking the direct upstream path. Move the orchestrator lookup into this guarded branch as well.

Suggested fix
-    let mcp_orchestrator = match ctx.components.mcp_orchestrator() {
-        Some(m) => m,
-        None => {
-            return error::internal_error("internal_error", "MCP orchestrator required");
-        }
-    };
-
     // Only MCP-laden requests need the format registry; without this
     // narrowing, plain non-MCP requests would 500 in deployments that run
     // the gateway without MCP wiring. A registry-less MCP request still
     // hard-fails — silent fallback would mis-route hosted tools.
     let mcp_routing = match original_body.tools.as_deref() {
         Some(tools) if request_uses_mcp_routing(tools) => {
+            let Some(mcp_orchestrator) = ctx.components.mcp_orchestrator().cloned() else {
+                return error::internal_error("internal_error", "MCP orchestrator required");
+            };
             let Some(registry) = ctx.components.mcp_format_registry() else {
                 return error::internal_error(
                     "internal_error",
                     "MCP format registry required for requests carrying MCP/builtin tools",
                 );
             };
-            ensure_request_mcp_client(mcp_orchestrator, registry, tools)
+            ensure_request_mcp_client(&mcp_orchestrator, registry, tools)
                 .await
-                .map(|servers| (servers, registry.clone()))
+                .map(|servers| (servers, registry.clone(), mcp_orchestrator))
         }
         _ => None,
     };
 
-    if let Some((mcp_servers, mcp_format_registry)) = mcp_routing {
+    if let Some((mcp_servers, mcp_format_registry, mcp_orchestrator)) = mcp_routing {
         let session_request_id = original_body
             .request_id
             .clone()
             .unwrap_or_else(|| format!("req_{}", uuid::Uuid::now_v7()));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@model_gateway/src/routers/openai/responses/non_streaming.rs` around lines 66
- 80, The current code eagerly obtains ctx.components.mcp_orchestrator() before
checking whether the request needs MCP routing, which can cause non-tool
requests to error; change the logic so that you only call
ctx.components.mcp_orchestrator() inside the guarded branch where tools are Some
and request_uses_mcp_routing(tools) is true and after confirming
ctx.components.mcp_format_registry() exists; then pass that orchestrator into
ensure_request_mcp_client(...) (instead of using a pre-fetched mcp_orchestrator)
so non-MCP requests can take the direct upstream path without failing.
model_gateway/src/routers/common/openai_bridge/transformer.rs (1)

41-58: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Passthrough failures still discard MCP-provided error text.

When error_message is None, this falls back to "Tool execution failed" and clears McpCall.output, even though ToolExecutionOutput.output can already contain the real MCP error payload. That leaves clients with a generic failure and no actionable detail.

Suggested fix
 fn failed_output_item(
     output: &smg_mcp::ToolExecutionOutput,
     response_format: ResponseFormat,
 ) -> ResponseOutputItem {
     let err_msg = output
         .error_message
         .clone()
-        .unwrap_or_else(|| "Tool execution failed".to_string());
+        .filter(|msg| !msg.is_empty())
+        .or_else(|| {
+            let text = ResponseTransformer::flatten_mcp_output(&output.output);
+            (!text.is_empty()).then_some(text)
+        })
+        .unwrap_or_else(|| "Tool execution failed".to_string());
     match response_format {
         ResponseFormat::Passthrough => ResponseOutputItem::McpCall {
             id: mcp_response_item_id(&output.call_id),
             status: "failed".to_string(),
             approval_request_id: None,
             arguments: output.arguments_str.clone(),
             error: Some(err_msg),
             name: output.tool_name.clone(),
-            output: String::new(),
+            output: ResponseTransformer::flatten_mcp_output(&output.output),
             server_label: output.server_label.clone(),
         },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@model_gateway/src/routers/common/openai_bridge/transformer.rs` around lines
41 - 58, The Passthrough branch in failed_output_item is discarding the MCP
error payload by setting ResponseOutputItem::McpCall.output to an empty string
and using a generic err_msg when output.error_message.is_none(); change the
logic so failed_output_item (function) uses output.output.clone() (from
ToolExecutionOutput.output) as the McpCall.output when output.error_message is
None (or append it alongside error_message), and keep error set to the concrete
error_message when present (falling back to the generic message only for the
error field), ensuring ResponseOutputItem::McpCall.output carries the
MCP-provided payload instead of being cleared.
model_gateway/src/routers/common/persistence_utils.rs (1)

330-346: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Structural items still don't round-trip in their original shape.

This now stores non-message items whole, but item_to_json() still sends unknown non-message types through the default content branch. A replayed image_generation_call therefore still comes back as {"type":"image_generation_call","content":{...}} instead of restoring top-level fields like revised_prompt and status.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@model_gateway/src/routers/common/persistence_utils.rs` around lines 330 -
346, The persistence path stores non-message items whole (store_whole_item when
item_type != "message") but item_to_json still nests unknown non-message types
under a content field; update the item_to_json logic to return
item_value.clone() unchanged for non-message types instead of falling through to
the default content branch, while keeping the existing message handling (the
branch that wraps { "content", "phase" } when item_type == "message" and phase
is present). Ensure the checks reference the same symbols (item_to_json,
store_whole_item / item_type, and item_value) so replayed structural items like
image_generation_call restore their original top-level fields rather than
becoming {"type": "...", "content": {...}}.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@model_gateway/src/routers/common/openai_bridge/transformer.rs`:
- Around line 41-58: The Passthrough branch in failed_output_item is discarding
the MCP error payload by setting ResponseOutputItem::McpCall.output to an empty
string and using a generic err_msg when output.error_message.is_none(); change
the logic so failed_output_item (function) uses output.output.clone() (from
ToolExecutionOutput.output) as the McpCall.output when output.error_message is
None (or append it alongside error_message), and keep error set to the concrete
error_message when present (falling back to the generic message only for the
error field), ensuring ResponseOutputItem::McpCall.output carries the
MCP-provided payload instead of being cleared.

In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 330-346: The persistence path stores non-message items whole
(store_whole_item when item_type != "message") but item_to_json still nests
unknown non-message types under a content field; update the item_to_json logic
to return item_value.clone() unchanged for non-message types instead of falling
through to the default content branch, while keeping the existing message
handling (the branch that wraps { "content", "phase" } when item_type ==
"message" and phase is present). Ensure the checks reference the same symbols
(item_to_json, store_whole_item / item_type, and item_value) so replayed
structural items like image_generation_call restore their original top-level
fields rather than becoming {"type": "...", "content": {...}}.

In `@model_gateway/src/routers/openai/responses/non_streaming.rs`:
- Around line 66-80: The current code eagerly obtains
ctx.components.mcp_orchestrator() before checking whether the request needs MCP
routing, which can cause non-tool requests to error; change the logic so that
you only call ctx.components.mcp_orchestrator() inside the guarded branch where
tools are Some and request_uses_mcp_routing(tools) is true and after confirming
ctx.components.mcp_format_registry() exists; then pass that orchestrator into
ensure_request_mcp_client(...) (instead of using a pre-fetched mcp_orchestrator)
so non-MCP requests can take the direct upstream path without failing.

In `@model_gateway/src/routers/openai/responses/streaming.rs`:
- Around line 1089-1105: Currently ctx.components.mcp_orchestrator() is resolved
unconditionally causing internal errors for non-MCP streaming requests; move the
orchestrator lookup into the branch guarded by request_uses_mcp_routing(tools)
and after confirming ctx.components.mcp_format_registry() exists, so only
requests with MCP/builtin tools call ctx.components.mcp_orchestrator(), then
pass that orchestrator into ensure_request_mcp_client along with the cloned
registry and tools (references: original_body.tools, request_uses_mcp_routing,
mcp_format_registry, mcp_orchestrator, ensure_request_mcp_client).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e1ea2c2e-5759-4884-9447-b3aaafa929c8

📥 Commits

Reviewing files that changed from the base of the PR and between b6bf495 and 9bbdc28.

📒 Files selected for processing (10)
  • crates/mcp/src/core/orchestrator.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/openai_bridge/format_registry.rs
  • model_gateway/src/routers/common/openai_bridge/tool_descriptors.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/grpc/regular/responses/streaming.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs
  • model_gateway/tests/mcp_test.rs

slin1237 added 5 commits May 4, 2026 11:30
ToolEntry::from_server_tool initialised .annotations to default and
never copied tool.annotations across, so every server-loaded tool
reported read_only=false regardless of what the MCP server advertised.
Combined with the read_only narrowing in the bridge, mcp_list_tools
on the wire became uniformly read_only=false even for tools the server
explicitly marked read-only.

Lift via ToolAnnotations::from_rmcp_option so the read_only/destructive/
idempotent/open_world hints flow through the way the approval pipeline
already expects.

Comment addressed (PR #1442): chatgpt-codex-connector 3183335208.

Signed-off-by: Simo Lin <[email protected]>
CallToolResult { is_error: true, .. } can leave error_message unset
while carrying the real failure text inside output.output (typed text
blocks). The previous fallback chain went straight to the generic
"Tool execution failed" placeholder, hiding the actual MCP-declared
error from the client.

Add a flatten_mcp_output fallback ahead of the placeholder so
mcp_call.error stays informative when the server uses the
result-blocks convention.

Comment addressed (PR #1442): coderabbitai 3183370193.

Signed-off-by: Simo Lin <[email protected]>
Bug E ("replayed structural input items lose content") fixed the
WRITE side via store_whole_item = item_type != "message", but
item_to_json kept wrapping non-listed types under {"content": …}.
Result: an image_generation_call linked to a conversation came back
as {"type":"image_generation_call","content":{"type":"image_generation_call","revised_prompt":"cat",…}}
instead of restoring revised_prompt/status at the top level.

Mirror the write rule: when item.content is an object, hoist its
top-level fields (skipping id/type/role/status which we already
populated from the ConversationItem record). For pre-fix rows that
stored an array, keep the legacy {"content": …} wrap so old data
still round-trips.

Comment addressed (PR #1442): coderabbitai 3183370204.

Signed-off-by: Simo Lin <[email protected]>
…types

The two helpers had drifted: extract_builtin_types covered
WebSearchPreview/CodeInterpreter/ImageGeneration but not FileSearch,
while request_uses_mcp_routing hard-coded the same trio independently.
A future addition to either list (notably file_search) would only
land in one place.

Make the predicate consult extract_builtin_types so they can't
diverge — adding a new builtin to the routing path automatically makes
the predicate cover it.

Comment addressed (PR #1442): coderabbitai 3183370189.

Signed-off-by: Simo Lin <[email protected]>
Round 1 of D scoped the format-registry fail-fast to MCP-laden
requests, but ctx.components.mcp_orchestrator() was still required
unconditionally on entry. Plain non-MCP requests still 500'd in
deployments without MCP wiring, defeating the original goal.

Move the orchestrator lookup into the same MCP-routing arm and
extend the resolved tuple to (mcp_servers, mcp_orchestrator,
mcp_format_registry). Plain requests no longer touch either
component.

Comments addressed (PR #1442): coderabbitai 3183370208 / 3183370213.

Signed-off-by: Simo Lin <[email protected]>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
model_gateway/src/routers/common/mcp_utils.rs (1)

250-259: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

request_uses_mcp_routing still doesn't match actual MCP routing.

Line 256 still returns true for any hosted builtin, even when no builtin MCP server is configured, so plain hosted-tool requests can still hard-fail in MCP-disabled deployments instead of falling through upstream. The same helper also stays blind to file_search because extract_builtin_types still omits that family, so configured file_search routing won't activate interception. Please derive this predicate from resolved/configured routing, not builtin presence.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@model_gateway/src/routers/common/mcp_utils.rs` around lines 250 - 259, The
predicate request_uses_mcp_routing is wrong because it returns true for any
hosted builtin via extract_builtin_types and misses file_search; change it to
consult the resolved/configured routing data instead of builtin presence: update
request_uses_mcp_routing to check the actual MCP routing configuration (e.g.,
the routing map or resolved target list the gateway uses at runtime) and return
true only if a tool in tools is explicitly routed to an MCP server (handle
ResponseTool::Mcp(_) as before) or if the resolved routing for known builtin
families (including file_search) points to an MCP endpoint; remove reliance on
extract_builtin_types for this predicate so it matches real configured routing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@model_gateway/src/routers/common/persistence_utils.rs`:
- Around line 335-349: Remove the unused _is_input parameter from the function
signature of item_to_new_conversation_item and update all call sites that pass
an explicit boolean (search for item_to_new_conversation_item(...) usages) to
omit the third argument; ensure the function body is unchanged except for
deleting the _is_input parameter and adjust any imports or references that
relied on the previous arity so compilation succeeds.

---

Duplicate comments:
In `@model_gateway/src/routers/common/mcp_utils.rs`:
- Around line 250-259: The predicate request_uses_mcp_routing is wrong because
it returns true for any hosted builtin via extract_builtin_types and misses
file_search; change it to consult the resolved/configured routing data instead
of builtin presence: update request_uses_mcp_routing to check the actual MCP
routing configuration (e.g., the routing map or resolved target list the gateway
uses at runtime) and return true only if a tool in tools is explicitly routed to
an MCP server (handle ResponseTool::Mcp(_) as before) or if the resolved routing
for known builtin families (including file_search) points to an MCP endpoint;
remove reliance on extract_builtin_types for this predicate so it matches real
configured routing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 607c7ef2-9c73-4f7d-967b-fe0152770f95

📥 Commits

Reviewing files that changed from the base of the PR and between 9bbdc28 and a5b6d24.

📒 Files selected for processing (6)
  • crates/mcp/src/inventory/types.rs
  • model_gateway/src/routers/common/mcp_utils.rs
  • model_gateway/src/routers/common/openai_bridge/transformer.rs
  • model_gateway/src/routers/common/persistence_utils.rs
  • model_gateway/src/routers/openai/responses/non_streaming.rs
  • model_gateway/src/routers/openai/responses/streaming.rs

Comment thread model_gateway/src/routers/common/persistence_utils.rs
slin1237 added 2 commits May 4, 2026 16:52
…t into entry

The previous fix lifted rmcp tool annotations into ToolEntry.annotations
via ToolAnnotations::from_rmcp_option. Production approval policy reads
those same annotations and applies conservative defaults
(destructive_hint.unwrap_or(true)), so any tool whose server provided an
annotations object without an explicit destructive_hint started getting
denied with "Tool execution denied: <tool>". This took out
brave_web_search across the openai-responses and anthropic-messages e2e
lanes (mcp_tool_result.is_error=true, multiple streaming/non-streaming
asserts).

Drop the lift in from_server_tool. Read read_only_hint directly off
tool.annotations inside build_mcp_tool_infos so the wire surface
(mcp_list_tools[].tools[].annotations.read_only) still reflects what the
server advertised, without retroactively changing the policy gating
that production was de-facto bypassing. Replaces the lift-side test
with two read-side tests (rmcp hint surfaces + absent hint defaults to
false).

Signed-off-by: Simo Lin <[email protected]>
…s_error

Bug B (3175392558) made transform_tool_output respect output.is_error
to surface failed tool executions instead of stamping them as
successful. For mcp_call this is correct — the variant has an `error`
field and persisting+replaying a fake-success was the documented
concern. For the four hosted-builtin variants
(web_search_call/code_interpreter_call/file_search_call/
image_generation_call) the behavior was over-strict:

- The wire variants have no `error` field, so status is the only
  channel and "failed" is the only signal.
- OpenAI cloud's native hosted tools never emit Failed for soft
  failures (rate-limited search, no results, transient backend errors)
  and instead return Completed with empty/limited content.
- Many MCP servers set isError=true for those soft-failure cases,
  exactly matching the soft-failure pattern OpenAI cloud handles
  silently.

The result: brave-search MCP returning isError=true for a normal
"completed but with caveats" response made the gateway emit
status=failed, breaking test_web_search_preview_streaming_events
(which asserts cloud-parity status=completed). The streaming events
fired the success path (web_search_call.completed) but the persisted
item said failed — internally inconsistent.

Restrict failed_output_item to Passthrough only; hosted-builtin
formats fall through to the success-path builders unconditionally.
The reviewer's persistence-replay concern is still met for mcp_call
(where it actually applies — that variant has the error channel).

Updates the regression test to assert the new hosted behavior. Updates
the now-stale comment in grpc/regular/responses/streaming.rs that
described the previous status=failed pattern.

Signed-off-by: Simo Lin <[email protected]>
@slin1237 slin1237 merged commit 03e8d5a into main May 5, 2026
80 of 99 checks passed
@slin1237 slin1237 deleted the fix/openai-bridge-review-followups branch May 5, 2026 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

grpc gRPC client and router changes mcp MCP related changes model-gateway Model gateway crate changes openai OpenAI router changes tests Test changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant